Abstract

Software verification is the process of determining whether a
piece of software is reliable, that is, whether it performs as
it is supposed to. As traditionally performed, program
verification can account for 40 percent or more of the
development time and cost of a software product. In spite of
this fact, released software is notorious for its
unreliability. These two facts, the expense of our attempts at
program verification and our limited success, have sustained
a great deal of research interest directed at finding more
effective methods.

This thesis develops extensions to a promising new
verification technique called Partition Analysis, developed by
Debra J. Richardson (1981). Partition Analysis appears to
be a powerful approach for identifying program faults, but
in its current state can only be applied to single program
modules that produce no side effects, including input or
output. This thesis extends the applicability of Partition
Analysis by permitting the use of procedure and function
calls, thereby allowing complete programs to be analyzed.

The result is a set of techniques for handling regular,
nonrecursive procedure and function calls, separate methods for
the analysis of recursive procedures and functions, and an
approach to the larger problem of analyzing entire programs.

I. Introduction

Program verification is the process of determining
whether a piece of software is reliable, that is, whether it
performs as it is supposed to. As traditionally performed,
program verification can account for 40 percent or more of the
development time and cost of a software product (Pressman,
1987: 467). In spite of this fact, released software is
notorious for its unreliability (Pressman, 1987: 13-14).
These two facts, the expense of our attempts at program
verification and our limited success, have sustained a great
deal of research interest directed at finding more effective
methods. This thesis is a study of a relatively new
technique for program verification called Partition Analysis
(Richardson, 1981). The Partition Analysis technique shows
great promise, but in its current form is severely limited
in its scope of applicability. The contribution this thesis
makes is to expand that scope.

Background 

General Approaches to Program Verification. In the
best of all possible worlds, the generation of software
would be a fully automated procedure. The user or other
interested party would prepare a specification detailing the
desired functional and operational characteristics of the
software and would present it to a system that would
automatically generate a program guaranteed to match the
specification. Automatic programming, as this procedure is
called, is one of the long-standing goals of the Artificial
Intelligence (AI) field but, not surprisingly, this ideal is
far from realization: programming is a complex,
knowledge-intensive, incompletely understood and error-prone
process even for human experts. Efforts toward automating the
process have been categorized into three subareas:

1. AI programming environments;

2. studies of the software design process; and

3. knowledge-based software assistants.

(Mostow, 1985: 1253)

Automatic programming remains a long-range goal, but is not
expected to solve the verification problem in the
foreseeable future.

If automated programming is currently beyond our
capabilities, the next best thing would be to write a program
and then apply some method (again, preferably an automated
one) to show that it meets its specification. In other
words, we would like to develop a proof of the correctness
of our program, with full mathematical rigor and certainty.

An early and still active approach proceeds by defining
a formal semantics for a programming language in the form of
a set of axioms. Programs in that language can then be
translated into assertions in the predicate calculus, and
the correctness of the program (its correspondence with a
specification likewise expressed in the predicate calculus)
becomes a theorem to be proved (Hoare, 1969).
Unfortunately, both the translation and the proof are tedious
and error prone, and not easily automated. It has also been
noted by several researchers that the proper way to apply
formal proof techniques is to develop the program and the
proof concurrently, or even to derive the program from the
proof (Dunn, 1984: 159). This further complicates attempts
at automating the approach, which as a manual technique
remains impractical for all but the smallest programs (Dunn,
1984: 159).

Given the current impossibility of automatic
programming and the extreme difficulty of applying formal
techniques, a third approach toward achieving at least some
confidence in the reliability of our software is testing:
running a program with some subset of the data it is supposed
to handle and checking the result. As Dijkstra has pointed
out, "testing can only reveal the presence of errors, never
their absence" (Dijkstra, 1972). It has been shown that
anything short of exhaustive testing (running the program
with every possible set of inputs) leaves open the
possibility of an incorrect program escaping detection by
working correctly on the subset of data tested (Weyuker and
Ostrand, 1980). In spite of this rather dismal fact, a
carefully chosen set of test data can reveal many of the
errors in a program, and a test run resulting in no errors
detected can greatly increase confidence in the reliability
of software.

Software testing, done informally, was the first
technique applied to verifying software. Over the years a large
number of techniques have been developed to make it
more effective and efficient. Many of these have also been
automated, at least in part.

Symbolic execution is a verification technique that
combines formal verification and testing. In symbolic
execution input values are represented by symbols instead of
literal values and statements are executed symbolically to
produce formulas for the values of the program variables.
These formulas can then be analyzed for correctness or to
guide the selection of ordinary test data. Several
automated systems for symbolic execution have been developed.

Partition  Analysis 

Partition Analysis is a technique developed by Debra
Richardson in her doctoral research that combines both
formal verification techniques and testing in order to acquire
confidence in a program's reliability (Richardson, 1981).
It is also distinctive in that it makes extensive use of
both the specification and implementation of a design.

As currently developed, Partition Analysis can be
applied only to single modules that do not produce side
effects (including I/O). It is also directed primarily at
numerical algorithms and has not been tried on programs that
perform symbolic processing.

The Partition Analysis method consists of three steps.
First, symbolic evaluation and other analysis techniques are
used to produce the procedure partition of the module. Each
element of this partition defines a subdomain of the
module's input and describes the computation to be performed on
this domain according to the specification and also
according to the implementation. Second, proof techniques are
applied to the two computation descriptions to demonstrate
their equivalence (or nonequivalence, in which case a fault
has been found). Third, the subdomain and computation
descriptions are used to guide the selection of test data to
exercise the functional behavior of the module.


Problem and Approach

Evaluation of Partition Analysis by Richardson and
Clarke (1985) shows it to be very effective at finding
program faults. The restriction to single modules with no side
effects, even I/O, however, drastically limits its
usefulness. As currently developed it remains an academic
exercise, but one with great potential. This thesis attempts to
realize some of that potential by enhancing Partition
Analysis to make it applicable to a larger class of programs. In
particular, the objective chosen was to devise methods for
handling procedure and function calls. Since most
programming languages permit recursive calls, the special case of
recursion was specifically included in the objective.

Procedures and functions are the building blocks of all
practical programs. Expanding Partition Analysis to permit
their use makes the technique applicable to entire programs
rather than just isolated modules. It was felt that this
extension would therefore be more useful than, say, the
inclusion of I/O or the elaboration of any of the methods
already used internally by Partition Analysis.

The approach followed was first, of course, to understand
as fully as possible the Partition Analysis method itself.
Since Partition Analysis uses many other verification
techniques, an extensive review of these techniques was
necessary. The basic method for handling procedure and function
calls was devised by a close examination of symbolic
execution, the particular needs of Partition Analysis, and some
hints in Richardson's dissertation (1981) itself. Separate
methods for recursive calls were developed by analogy with
Richardson's methods for analyzing program loops. The new
methods focus on step one of Partition Analysis, forming the
procedure partition when procedure and function calls are
present. The formal verification and testing steps are also
affected, but to a lesser degree.

The result is a set of techniques for handling
nonrecursive procedure and function calls, with the choice of
technique depending on the program to be analyzed.
Recursive calls are handled by their own technique, again
with variations. Several examples of recursive and
nonrecursive routines were analyzed to verify that the proposed
techniques do work.


Overview  of  the  Rest  of  the  Thesis 


The remainder of the thesis consists of four chapters.
Chapter 2 is a close look at many of the techniques that
have been developed for software testing, and a look at
symbolic execution. Chapter 3 explains the Partition Analysis
method in detail, with a detailed example. Chapter 4
presents the extensions that were made to the method. Chapter
5 offers some conclusions about the method and points to
several directions for future work.


II. Literature Review

Numerous techniques for carrying out software
verification have been developed. This chapter reviews some of the
work done in program testing and symbolic execution. Since
Partition Analysis uses many of these techniques itself,
this chapter also provides further background on the method.

Software  Testing 

The effort to replace ad hoc testing practices with
systematic methods has produced a number of basic techniques
that reflect early attempts to address the testing problem
or that address particular special cases. The basic
techniques are typically categorized as being either "black box"
or "glass (sometimes white) box" techniques. Black box
testing relies strictly on the specification to describe the
intended function of the software and to guide the selection
of sample data to test whether the software implements the
function correctly. Glass box testing augments the
information provided by the specification with structural
information about how the program works. Both kinds of techniques
are used in practice; the two approaches are complementary
in that they tend to detect different classes of errors
(Pressman, 1987: 484). Glass box techniques are more
mathematically tractable and more generally effective but can get
unwieldy, while black box techniques are easier to apply but
are more likely to miss certain types of errors.

Recently these basic ideas have begun to be combined
into comprehensive approaches to disciplined testing
in-the-large. Two such approaches are Howden's Functional Testing
approach (Howden, 1986) and Partition Analysis, discussed in
detail in the following chapter.

Before discussing specific techniques, some terminology
is needed. A program failure is a discrepancy between the
observed behavior of a program and its intended behavior. A
program fault is an error in a program that causes it to
fail. Failures have been classified as either domain errors
or computation errors. Domain errors are concerned with
which execution path is followed in the processing of data.
If the wrong path is followed, a path selection error has
occurred. If the data falls into a special case that the
program fails to recognize altogether, a missing path error
has occurred. Computation errors occur when the correct
path is followed, but the path processes the data
incorrectly. In practice, one program fault can cause many
failures, and some failures can be of more than one type.

Black Box Testing. Black-box methods focus on
exercising all functional requirements of a program without
considering its implementation. The basic idea is to break down
the specification and identify all of the individual
functions performed, and then test each one (Dunn, 1984: 233).


Equivalence partitioning (Myers, 1979) is a method for
partitioning the input domain into classes that we can
"reasonably" assume are processed equivalently. Testing any
value in a class then provides confidence that all values in
the class are processed correctly. The technique of
breaking up the input domain into subdomains is called domain
analysis and is used in other black-box methods as well.
Domain analysis is also a glass-box technique when applied
to an implementation. In this case each path through the
program is considered, and the conditions that the input
must meet in order for that path to be followed constitute
the input domain.

Empirical data has shown that more errors tend to occur
at the boundaries of an input domain, so these boundaries
should be exercised more fully. Once equivalence
partitioning is done, test cases are chosen that lie just inside each
class, just outside each class, and somewhere in the
"middle" (Pressman, 1987: 486). Extensive work has been done
showing how close to the boundaries to get and how effective
this technique is, particularly the glass-box version (White
and Cohen, 1980; Clarke, Hassell, and Richardson, 1982).
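
The selection of values just inside, just outside, and in the middle of a class can be sketched as follows. This is only an illustration in Python, assuming the equivalence class is a closed interval of integers; the function name boundary_test_values is hypothetical and does not come from the cited sources.

```python
def boundary_test_values(lo, hi):
    """Return test inputs for the integer class lo..hi (inclusive).

    Assumption for illustration: the class is a closed integer interval.
    Picks values just outside, just inside, and near the middle.
    """
    if lo > hi:
        raise ValueError("empty class: lo must not exceed hi")
    middle = (lo + hi) // 2
    candidates = [lo - 1, lo, middle, hi, hi + 1]
    # Remove duplicates (e.g. when lo == hi) while keeping order.
    unique = []
    for value in candidates:
        if value not in unique:
            unique.append(value)
    return unique


print(boundary_test_values(1, 100))
```

For classes defined by several variables or by inequalities between them, choosing points near a boundary is considerably harder, which is the subject of the domain testing work cited above.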

Fault seeding is a black-box technique that
deliberately puts errors into a program in order to judge the
effectiveness of the testing being done. In this technique a
number of errors are deliberately introduced and then
testing by some other technique is done. If n faults were
seeded, m total faults were exposed, and c of the faults
found were seeded faults, then an estimate of the total
number of faults N in the program is N = (n * m) / c, and so
the number remaining undetected is N - m. In practice,
if the ratio (m - n) / (N - n) is less than 0.9, more
testing is called for; values close to 1 provide high confidence
in the effectiveness of the tests.
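
A small worked example may make the seeding estimate concrete. The sketch below is in Python and assumes the estimate has the form N = (n * m) / c, with n faults seeded, m faults exposed in total, and c of the exposed faults being seeded ones; the function name and the sample numbers are hypothetical.

```python
def seeding_estimate(seeded, exposed, seeded_found):
    """Estimate total faults N = (n * m) / c and the number still undetected.

    seeded       -- n, faults deliberately introduced
    exposed      -- m, total faults exposed by testing
    seeded_found -- c, exposed faults that were seeded ones
    """
    if seeded_found <= 0:
        raise ValueError("no seeded faults found; estimate is undefined")
    total = seeded * exposed / seeded_found
    remaining = total - exposed
    return total, remaining


# Hypothetical run: 10 faults seeded, 20 exposed, 8 of those were seeded.
total, remaining = seeding_estimate(seeded=10, exposed=20, seeded_found=8)
print(total, remaining)
```

With these numbers the estimate is 25 faults in total, 5 of them still undetected. The estimate is only as good as the assumption that seeded faults are about as hard to find as the program's own faults.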

Glass Box Testing. Glass-box testing uses the details
of the implementation to guide the selection of test data
(Dunn, 1984: 199). One important tool used by most
glass-box techniques is the control flow graph of a program. In
this graph the nodes represent statements of the program and
edges represent the flow of control from one statement to
another. Associated with each edge is a condition that must
be true for the transfer to take place. For sequential flow
or an unconditional branch the condition has the constant
value true and is generally not shown; sequences of
statements with no branches are sometimes collapsed into a single
node. The common if-then-else and loop constructs typically
use boolean decisions to choose one of two possible edges to
follow: the condition for one edge is simply the negation
of the other. More complex control structures such as the
case statement or the select statement in Ada have more
arbitrary sets of conditions. Figure 2-1 shows a program for
determining whether a positive integer is a prime, and gives
the corresponding control flow graph.

function Prime (N : POSITIVE) return BOOLEAN is
   Factor  : INTEGER;
   IsPrime : BOOLEAN;
begin
   if (N mod 2 = 0) or (N mod 3 = 0) then
      IsPrime := (N < 4);
   else
      IsPrime := TRUE;
      Factor := 5;
      while Factor ** 2 <= N loop
         if (N mod Factor = 0)
            or (N mod (Factor + 2) = 0) then
            IsPrime := FALSE;
            exit;
         else
            Factor := Factor + 6;
         end if;
      end loop;
   end if;
   return IsPrime;
end Prime;

a.  Ada  source  code  for  PRIME 

Adapted  from  Richardson,  1981:  24. 


b.  Control  flow  graph  for  PRIME 
Figure  2-1.  Example  Control  Flow  Graph 


A technique called basis path testing is a simple way
to use the control flow graph to guide test data selection
(Pressman, 1987: 472-482). The graph is used to find a
minimal set of linearly independent paths through the graph
such that each edge appears in at least one path. The
number of paths required to do this is E - N + 2, where E
is the number of edges in the graph, and N is the number of
nodes. Test data is selected that will cause the program
execution to follow each of these paths.
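
The path count E - N + 2 is easy to compute from an edge list. The Python sketch below assumes a connected control flow graph with one entry and one exit, given as (source, target) pairs; the graph shown is a hypothetical one (an if-then-else followed by a while loop), not the graph of Figure 2-1.

```python
def basis_path_count(edges):
    """Return E - N + 2 for a control flow graph given as (source, target) pairs."""
    nodes = set()
    for source, target in edges:
        nodes.add(source)
        nodes.add(target)
    return len(edges) - len(nodes) + 2


# Hypothetical graph: an if-then-else followed by a while loop.
edges = [
    ("entry", "if"),
    ("if", "then"),
    ("if", "else"),
    ("then", "while"),
    ("else", "while"),
    ("while", "body"),
    ("body", "while"),
    ("while", "exit"),
]
print(basis_path_count(edges))
```

Here there are 8 edges and 7 nodes, so 3 basis paths are needed, which matches the two boolean decisions in the graph plus one.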

Loop constructs are very common in programs and have
special problems associated with them, so special methods
for testing loops have been devised (Pressman, 1987: 483-4).
Problems associated with loops are initialization errors,
indexing or incrementing errors, and bounding errors at loop
limits. For a simple loop, test cases are devised that skip
the loop entirely, pass through the loop exactly once,
exactly twice, some "large" number of times m, and if there is
a maximum number of times the loop is allowed to be
executed, n, then the loop is tested for n - 1, n, and n + 1
(if possible). Nested loops and concatenated loops lead to
an inordinate amount of testing, so some simplifications are
specified to control this.
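
The simple-loop rule can be written down as a short routine that lists the iteration counts to exercise. This is a Python sketch under the assumption that the tester supplies the "large" count m and, if one exists, the maximum n; the function name loop_test_counts is hypothetical.

```python
def loop_test_counts(typical, maximum=None):
    """Iteration counts to exercise for a simple loop.

    typical -- a "large" representative number of passes (m in the text)
    maximum -- the loop's upper bound on passes (n in the text), if any
    """
    counts = [0, 1, 2, typical]
    if maximum is not None:
        counts += [maximum - 1, maximum, maximum + 1]
    # Drop negatives and duplicates, and return in ascending order.
    return sorted({count for count in counts if count >= 0})


print(loop_test_counts(typical=50, maximum=100))
```

The routine only names the counts; finding input data that actually drives the loop through each count is a separate problem.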

Mutation testing (Dunn, 1984: 218-220) is a glass-box
technique somewhat like fault seeding that is intended to
increase confidence in the ability of the chosen test cases
to detect errors. Small changes are made one at a time to
the program to introduce errors deliberately. Each "mutant"
program is then tested to see if a failure occurs. If so,
then that test case has been shown effective at finding
errors and our confidence that the original program is correct
increases. If the mutant passes the test, however, it is
analyzed to see if it is in fact equivalent to the original
program. If it is not, then the test has failed to detect
an error, and our confidence that the original program does
not contain a similar error decreases (in fact, it could be
that the mutant is correct and the original wrong).
Mutation testing works best on programs that are believed to be
basically sound except for relatively simple errors
(Richardson, 1981: 94).
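
A toy illustration of mutation testing follows, written in Python. The program under test (a leap-year check), the three mutants, and the test inputs are all hypothetical and chosen only to show how a weak test set lets mutants survive.

```python
def is_leap_year(year):
    """Original program under test (a hypothetical example)."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Each mutant differs from the original by one small change.
mutants = {
    "4 -> 2": lambda y: y % 2 == 0 and (y % 100 != 0 or y % 400 == 0),
    "and -> or": lambda y: y % 4 == 0 or (y % 100 != 0 or y % 400 == 0),
    "400 -> 200": lambda y: y % 4 == 0 and (y % 100 != 0 or y % 200 == 0),
}


def surviving_mutants(test_inputs):
    """Names of mutants that agree with the original on every test input."""
    return [
        name
        for name, mutant in mutants.items()
        if all(mutant(y) == is_leap_year(y) for y in test_inputs)
    ]


print(surviving_mutants([1996, 1900, 2000]))
print(surviving_mutants([1996, 1900, 2000, 1997, 1998, 1800]))
```

The first test set leaves two mutants alive, so it cannot distinguish the original from two plausible faulty versions; the second set kills all three. None of these mutants is equivalent to the original, so in this example every survivor points to a gap in the test data.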

The glass-box techniques discussed so far address both
domain and computation errors, but computation errors in
particular are prone to coincidental correctness when only a
few test cases are used, and so special techniques for
testing for certain kinds of computation errors have been
developed (Howden, 1989). For instance, expressions that are
polynomials can be shown correct by testing them with a num-

Comparison of black- and glass-box techniques.
Black-box techniques are best at finding certain types of errors:
incorrect or missing functions; interface errors; errors in
data structures; performance errors; and initialization and
termination errors (Pressman, 1987: 485). Black box
techniques tend to miss many other errors because the
implementation is rarely exercised fully: a single function in the
specification may be implemented as several special cases,
and a black box approach has no way of knowing this; we are
forced in effect to guess at what values will be processed
the same way by the program, without looking at it. "The
disadvantage of the black box testing approach is that it
ignores important functional properties of programs which
are part of its design and implementation and which are not
described in the requirements" (Howden, 1980: 162). Thus
the black and glass box approaches must be used together for
maximum effectiveness.

Functional Testing (Howden, 1980; Howden, 1986).
Howden's functional testing and analysis is an attempt to
reintegrate the primary alternative approaches to software
testing: static versus dynamic testing, black versus glass box
testing, and practical versus theoretical considerations.
It is an outgrowth of earlier empirical studies of software
testing, and is intended to "provide a framework for the
discussion of testing, to provide practical theoretical
results, to derive new results, and to indicate directions for
future research" (Howden, 1986: 997).


The theoretical basis is provided by a combined view of
the behavioral and structural properties of programs. At
the behavior level a program defines a mapping from its
input domain to its output domain; the specification of a
program is intended to provide a complete and accurate
description of this mapping, and is needed during testing as an
"oracle" for determining if a program is producing the
correct results. Structurally, a program is seen as a
collection of functions and data types. Functions act to convert
one type of data into another type, and the overall
transformation of the input types to the output types defines the
behavior of the program. Thus there is a duality between
types and functions. Program design methodologies (e.g.,
functional decomposition, data flow analysis, Structured
Design or object-oriented design) typically choose to
emphasize one aspect over the other; Howden's approach to testing
emphasizes the identification and analysis of the ...
of each, and then test data selection and the dynamic
analysis of the results of executing the program on that data.
The static analysis step proceeds by identifying three types
of units: functionally important classes of input and
output data, much as is done in the black box techniques; data
design structures within the program, which are subsets of
declared data structures that are functionally related; and
program design structures, which are identifiable functions
used to design and implement the program. Program design
structures may or may not correspond to contiguous pieces of
code, and Howden suggests that any available design
documentation (e.g., data flow diagrams, SADT charts) can help in
the identification of those functions whose implementation
corresponds to collections of paths in the program, or to
scattered pieces of code.

In the test data selection and execution step,
Functional Testing combines black and glass box techniques by
partitioning the input domains not only of the program as a
whole but also of the individual component functions
identified in the program's structure. Howden defines a number of
rules for identifying these partitions and for devising
fault-revealing test cases for them. Execution of these ...
the program. Howden compares the coverage achieved by his
method and that of other glass box techniques, and finds his
is more demanding, and hence more effective at finding
faults. He reports empirical studies to support this
conclusion.

In effect, Howden claims that techniques like branch
analysis and basis path testing function as approximations
to Functional Testing: the validating assumption needed to
make such techniques work is that paths in programs
correspond to functions. These techniques are weak because this
assumption is not true in general. The true structure of a
program is the functions that comprise it, regardless of its
textual structure, and this is what Functional Testing
focuses on. Conversely, it is precisely the loose
correspondence between functional structure and textual structure
that makes Howden's method much more difficult to automate
than the other glass box techniques; this is the major
disadvantage of the Functional Testing approach. Without
automation, applying the technique to any sizable program
remains impractical. Howden discusses some possible
approaches to automating parts of his scheme, but the area
remains open to future research.


Symbolic Execution

Somewhere between formal verification and ordinary
testing lies symbolic execution (Dunn, 1984: 136-137;
Darringer and King, 1978; Howden, 1977; King, 1976). Symbolic
execution is a generalization of the usual execution model
for computer programs in that program variables may be given
symbolic representations of their values instead of the
values themselves. The convention used in this thesis is to
capitalize the first letter of program variable names and to
use all lower case for symbolic values. For example, one
might represent the value of variable A by a and the value
of B by b. Then after execution of the statements

    A := A * A;
    B := B * B;
    C := A + B;

the values of the variables would be

    A = a * a
    B = b * b
    C = a * a + b * b

Normal execution is the special case where A and B have
actual numerical values and so the computations can be carried
out. The relationship between symbolic execution and
regular execution has been likened to that between algebra and
arithmetic.
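
The three assignments can be replayed symbolically in a few lines. The Python sketch below represents symbolic values as plain strings, which is an assumption made only for illustration: a real symbolic executor would build expression trees and parenthesize or simplify them, and the function name symbolic_run is hypothetical.

```python
def symbolic_run(statements, initial):
    """Execute simple assignments with symbolic strings as values.

    statements -- list of (target, left, operator, right) tuples
    initial    -- mapping from variable name to its symbolic value
    """
    values = dict(initial)
    for target, left, operator, right in statements:
        # Substitute the current symbolic values of the operands.
        values[target] = f"{values[left]} {operator} {values[right]}"
    return values


program = [
    ("A", "A", "*", "A"),   # A := A * A;
    ("B", "B", "*", "B"),   # B := B * B;
    ("C", "A", "+", "B"),   # C := A + B;
]
result = symbolic_run(program, {"A": "a", "B": "b"})
print(result)
```

The string form happens to be correct here because multiplication binds more tightly than addition; in general the substituted operands would need parentheses.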

If a program contains branches or loops, then merely
representing the values of all program variables is not
enough, since these values depend on which program path was
followed. During symbolic execution a special variable
called PC (path condition) is maintained to indicate all of
the conditions that had to be true in order for the executed
path to have been followed. PC is initially set to "true,"
and then each time a conditional branch is executed PC is
AND-ed with the condition corresponding to the branch taken.
For example, let the value of PC be "true," the value of N
be n, and consider executing the following statement:

    if N = 1 then
       X := 5;
    else
       X := 10;
    end if;

There are two possible paths here. If the first is chosen,
then the result is

    PC = true and (n = 1)
       = (n = 1)
    X  = 5

If the alternative is chosen, the values would be

    PC = true and (n /= 1)
       = (n /= 1)
    X  = 10

Sometimes the current value of PC is enough to
determine which way a branch should go. For instance, if in the
above example some previous branch had set PC to include the
condition (n < 0), then the second path is the only possible
one and PC could be left unchanged: the information that
(n /= 1) is redundant. Thus only "unresolved" branches are
recorded in PC.
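
The forking of PC at an unresolved branch can be sketched as follows. This Python fragment keeps PC as a list of condition strings, which is an assumption for illustration only; it makes no attempt to detect resolved branches, which would require a simplifier or theorem prover, and the function names are hypothetical.

```python
def execute_if(path_condition, condition, negation, then_value, else_value):
    """Fork on an if statement, returning one (PC, X) pair per branch.

    path_condition -- list of conditions already AND-ed into PC
    condition      -- symbolic branch condition, e.g. "n = 1"
    negation       -- its negation, e.g. "n /= 1"
    """
    return [
        (path_condition + [condition], then_value),
        (path_condition + [negation], else_value),
    ]


def show(path_condition):
    """Render PC; an empty conjunction is simply true."""
    return " and ".join(path_condition) if path_condition else "true"


paths = execute_if([], "n = 1", "n /= 1", 5, 10)
for pc, x in paths:
    print("PC =", show(pc), "  X =", x)
```

Starting from an empty PC (that is, "true"), the two results correspond to the two cases worked out in the text: PC = (n = 1) with X = 5, and PC = (n /= 1) with X = 10.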

Numerous automated systems have been developed to do
symbolic execution. One of the first is EFFIGY (King,
1976). It supports a simple (but nontrivial) language with
a PL/I-style syntax. It functions much like an interactive
debugger, providing trace and breakpoint facilities. It also
allows the user to specify variable values, either as
literals or symbolically, allowing ordinary execution, pure
symbolic execution, or any combination. When an unresolved
conditional is encountered during execution, the user can
specify "go true" or "go false" to choose which path to
follow, or can specify "assume (P)" to add the predicate P to
PC, which may (or may not) resolve the condition. Finally,
one can request that an "execution tree" be generated of all
possible paths through a program. Since any program with
loops potentially has arbitrarily long paths, this tree may
in fact be infinite. EFFIGY allows the user to specify a
bound on the height of the tree generated. Other systems,
such as DISSECT (Howden, 1977) and ATTEST (Clarke, 1976),
typically provide comparable facilities.
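
The effect of a height bound on an execution tree can be sketched as
follows. This is an illustrative Python sketch only, not EFFIGY's
actual algorithm; the function `execution_tree` and the "stay"/"exit"
labels are assumptions. It enumerates the branch decisions of a single
loop whose test is never resolved by PC.

```python
def execution_tree(depth_bound):
    """Enumerate decision sequences for one unresolved loop test.

    A path that is still inside the loop after depth_bound decisions
    is cut off rather than explored further.
    """
    complete, cut_off = [], []
    frontier = [()]
    while frontier:
        path = frontier.pop()
        if len(path) == depth_bound:
            cut_off.append(path)
            continue
        complete.append(path + ("exit",))  # loop test false: the path ends
        frontier.append(path + ("stay",))  # loop test true: go around again
    return complete, cut_off

complete, cut_off = execution_tree(3)
```

Without the bound the frontier would never empty, which is the sense in
which the tree is infinite.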

The  application  of  symbolic  execution  to  program  veri¬ 
fication  is  as  follows.  First,  one  can  look  at  the  formulas 
associated  with  program  variables  and  check  that  they  are 
correct.  One  can  also  detect  cases  where  a  variable  is  used 
before  it  is  assigned  a  value,  and  other  such  computation 
errors  (Howden,  1977:  267). 

Second, symbolic execution can sometimes reveal that a
particular path is not executable because the PC for that
path evaluates to false (i.e., is a contradiction), or it
may reveal that some condition tested on a path is in fact
redundant. The general problem of proving that an arbitrary
predicate is a contradiction, or that one predicate implies
another, is unsolvable; in practice, however, it is
frequently possible to do this, ideally using an automated
theorem prover.

A  third  use  of  symbolic  execution  is  as  an  aid  to  do¬ 
main  analysis  and  test  data  selection  (Clarke,  1976).  Each 
path  through  a  program  defines  an  input  subdomain,  and  sym¬ 
bolic  execution  of  that  path  will  generate  the  predicate  PC 
defining  that  domain.  Finally,  symbolic  execution  can  aid 
in  the  development  of  a  formal  proof  of  correctness  (King, 
1976:  391). 

Symbolic  evaluation  has  its  advantages  and  disadvan¬ 
tages.  It  appears  to  be  superior  to  ordinary  testing  over¬ 
all  and  in  particular  is  good  at  detecting  computation  and 
domain  errors,  although  it  still  fails  to  detect  most  miss¬ 
ing  path  errors  (Howden,  1977:  277).  The  use  of  formulas 
instead  of  input/output  pairs  for  some  verification  helps  to 
guard  against  coincidental  correctness.  As  an  automated  aid 
to  other  kinds  of  analysis  (domain  analysis  and  formal 
proofs),  it  has  proved  valuable. 

On  the  negative  side,  symbolic  execution  involves  sub¬ 
stantial  overhead  to  maintain  and  manipulate  large  and  un¬ 
wieldy  symbolic  formulae.  The  value  of  any  automated  system 
in  particular  will  depend  heavily  on  the  sophistication  of 
its  formula  manipulation  and  theorem-proving  capabilities. 
Symbolic execution also has the disadvantages inherent to
any abstract model of execution: execution of programs on
real machines rarely conforms perfectly to the model (e.g.,
finite precision of machine computation). For this reason,
the execution of programs with actual data remains a
necessity. The ability of symbolic execution to aid in the
selection of test data considerably mitigates this
disadvantage. Finally, even executed symbolically a program can
have an infinite number of paths and path domains, making
any attempt at an exhaustive demonstration of correctness as
futile as in the usual testing paradigm.

III.  Partition  Analysis


Partition analysis is a technique that combines both
formal verification techniques and testing in order to
acquire confidence in a program's reliability. It is also
distinctive in that it makes extensive use of both the
specification and the implementation of a design, combining
black and glass box techniques to a greater degree even than
Howden's Functional Testing. Partition analysis can also be
applied to two specifications at different levels of
abstraction (e.g., a high-level design and a detailed design),
making the technique applicable in earlier phases of
software development.


Overview of Partition Analysis (Richardson, 1981: 25-50;
Richardson and Clarke, 1985)

Partition  Analysis  consists  of  three  steps:  forming 
the  procedure  partition,  partition  analysis  (formal)  verifi¬ 
cation,  and  partition  analysis  testing.  Forming  the  proce¬ 
dure  partition  is  likewise  a  three  step  process. 

First, symbolic execution is applied to the implementation
to get a static representation of it. This differs
from the more usual use of symbolic execution as an
interpretive technique. The result is a set of input domains
D[P_J] and corresponding computations C[P_J], one such pair
for each path P_J, J = 1, 2, ..., N. This is called the
implementation partition.

Next the specification is similarly analyzed to produce
the specification partition, a set of input domains D[S_I]
and corresponding computations C[S_I], one for each
"subspecification" S_I, I = 1, 2, ..., M. It is assumed
that a formal specification is available for this step, in
some high-level notation. Richardson uses a specification
language/PDL of her own design called SPA and extended the
symbolic evaluation techniques to handle this language.
Thus a subspecification is simply a path through the "code"
of the specification.

Once the implementation and specification partitions
have been derived, the procedure partition is formed by
intersecting all implementation domains D[P_J] and
specification domains D[S_I] to find all nonempty (overlapping)
pairs, D_IJ. Corresponding to each pair is a "computation
difference" C_IJ that represents the disparity (hopefully
null) between the computations specified by C[S_I] and
C[P_J]. Domains in one partition that do not overlap any
domain in the other partition indicate discrepancies between
the implementation and the specification, and hence probable
errors (possibly missing path). Analysis can continue by
including these domains in the procedure partition as
elements D_I0 (for unpaired specification domains) and D_0J
(for unpaired implementation domains).
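
The pairing of domains can be sketched in Python under a strong
simplifying assumption: domains are given as predicates and overlap is
judged over a finite sample of inputs, whereas Partition Analysis
intersects the symbolic domain expressions themselves. The function
`procedure_partition` and the toy domains below are hypothetical.

```python
def procedure_partition(spec, impl, sample):
    """Pair each spec domain with each impl domain sharing a sample input."""
    pairs, paired_spec, paired_impl = {}, set(), set()
    for i, in_spec in spec.items():
        for j, in_impl in impl.items():
            common = [x for x in sample if in_spec(x) and in_impl(x)]
            if common:  # nonempty overlap: element D_IJ
                pairs[(i, j)] = common
                paired_spec.add(i)
                paired_impl.add(j)
    unpaired_spec = [i for i in spec if i not in paired_spec]
    unpaired_impl = [j for j in impl if j not in paired_impl]
    return pairs, unpaired_spec, unpaired_impl

# Toy example: the specification covers n = 0, the implementation does not.
spec = {1: lambda n: n == 0, 2: lambda n: n > 0}
impl = {1: lambda n: n >= 1 and n % 2 == 0,
        2: lambda n: n >= 1 and n % 2 == 1}
pairs, unpaired_spec, unpaired_impl = procedure_partition(spec, impl, range(10))
```

Here subspecification 1 overlaps no implementation domain, so it would
be carried into the procedure partition as an unpaired element.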

Partition analysis verification takes the procedure
partition and tries to prove for each domain D_IJ using
standard proof techniques that C_IJ is null, that is, that
C[S_I] and C[P_J] have the same effects on all elements of
D_IJ. The equivalence problem is undecidable in general, so
this step may or may not succeed fully. A proof of
equivalence, however, is a very strong argument for the
correctness of the implementation. A proof of nonequivalence
identifies a fault in the program, and any counterexamples
found provide fault-revealing test data.

The  last  step  is  partition  analysis  testing. 

     The partition analysis method complements verification
     with testing. When the verification process is
     unsuccessful, testing may uncover errors or may increase
     confidence in the unproven equality relationships.
     When the verification process is successful, testing
     challenges or supports the conclusions drawn in the
     postulated environment of partition analysis
     verification (Richardson and Clarke, 1985: 1483).

In this step, the procedure partition subdomains and
computations are used to generate test data. Glass box
techniques for testing arithmetic manipulations, special,
extremal and nonextremal cases are applied to the
computations, while black box techniques such as domain testing
are applied to the domains. Because both implementation and
specification domains are being considered, domain testing
will not only pick up path selection errors but also some
missing path errors as well.

Loop Analysis

In order to derive static expressions for program
paths, Richardson had to develop techniques for analyzing
and representing loops. The description that follows is
taken from Richardson's dissertation (Richardson, 1981), but
has been modified to make certain aspects clearer.

The most general loop structure has the form

    loop
        loop-body;
    end loop;

where loop-body is a set of paths, some of which exit and
some of which do not. Those that do all branch to just past
the end of the loop; those that do not all branch back to
the top. Thus the loop is a single-entry, single-exit
construct (whether this general form is a structured construct
can be argued either way). The standard while-loop can be
written in this form as

    loop
        if not while-condition then exit;
        loop-body;
    end loop;

Richardson's loop technique depends upon knowing a
priori the complete sequence of paths followed through the
loop. This is a very strong condition that is frequently
not met; thus the technique is not at all general. In many
cases this condition is met, however, such as when a loop
has only one path through it that stays in the loop and all
the other paths exit it. Another workable case is when,
say, the first iteration through the loop follows one path
and all subsequent iterations follow another.

Figure 3-1 shows an example of a loop that cannot be
analyzed this way. Procedure SEARCH performs a binary
search for element X in a sorted array A and returns its
index if found. On each iteration of the loop, X may be found,
or it may be determined not to be present; in either case
the loop is exited. Two paths stay in the loop,
corresponding to X being less than or greater than the current
element being examined. One cannot determine the sequence of
these two paths that will be followed without knowing at least
some of the data values; hence, this loop cannot be analyzed
by Richardson's technique.


    procedure Search (A     : in ARRAY_TYPE;
                      X     : in ELEMENT_TYPE;
                      Found : out BOOLEAN;
                      Where : out INDEX_TYPE) is
       LB    : INDEX_TYPE := A'First;
       UB    : INDEX_TYPE := A'Last;
       Index : INDEX_TYPE;
    begin
       Found := FALSE;
       while LB <= UB loop
          Index := (LB + UB) / 2;
          if X < A(Index) then
             UB := Index - 1;
          elsif X > A(Index) then
             LB := Index + 1;
          else
             Found := TRUE;
             Where := Index;
             exit;
          end if;
       end loop;
    end Search;

Figure 3-1. An un-analyzable loop
Adapted from Darringer and King, 1978: 31
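
The dependence of the path sequence on the data can be observed
directly by transcribing the procedure of Figure 3-1 into Python and
recording which of the two in-loop paths each iteration takes. This is
an illustrative sketch; the name `search_trace`, the "left"/"right"
labels, and the zero-based indexing are assumptions of the sketch, not
part of the original procedure.

```python
def search_trace(a, x):
    """Binary search over sorted list a; returns (found, index, trace)."""
    lb, ub = 0, len(a) - 1
    trace = []  # in-loop path taken on each non-exiting iteration
    while lb <= ub:
        index = (lb + ub) // 2
        if x < a[index]:
            trace.append("left")   # path that lowers the upper bound
            ub = index - 1
        elif x > a[index]:
            trace.append("right")  # path that raises the lower bound
            lb = index + 1
        else:
            return True, index, trace
    return False, None, trace

a = [10, 20, 30, 40, 50, 60, 70]
print(search_trace(a, 30)[2])
print(search_trace(a, 60)[2])
```

Two different values of X over the same array produce different
sequences of in-loop paths, so no single sequence can be fixed in
advance.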


Assuming that a loop is analyzable, the first step is
to associate with it an iteration counter k to count the
number of loop iterations. Each path of the body of the [...]

[...] through the specification, corresponding [...] of the
else statement. Note [...] if clauses, so that all the
preceding conditions [...]

       function Prime (N : INTEGER range 2 .. INTEGER'LAST)
                      return BOOLEAN is
          Factor  : INTEGER;
          IsPrime : BOOLEAN;
    s  begin
    1     if (N mod 2 = 0) or (N mod 3 = 0) then
    2        IsPrime := (N < 4);
          else
    3        IsPrime := TRUE;
    4        Factor := 5;
    5        while Factor ** 2 <= N loop
    6           if (N mod Factor = 0)
                   or (N mod (Factor + 2) = 0) then
    7              IsPrime := FALSE;
                   exit;
                else
    8              Factor := Factor + 6;
                end if;
    9        end loop;
          end if;
    10    return IsPrime;
    f  end Prime;

Figure 3-3. Implementation of PRIME
Adapted from Richardson, 1981: 24


[...] attached to the three paths reflect this semantics.
The only output variable is the value of the function,
PRIME. In the first two cases this value is given
explicitly. In the last case, it is represented by a formula.

The implementation of PRIME contains a loop, and
Figures 3-6 and 3-7 give intermediate steps of the loop
analysis required to get an expression for it. There are three
paths through the loop, corresponding to the while condition
evaluating to false, the if condition being true, and the
else condition being true. The first two conditions result
in the loop being exited, while the last one stays in the
loop. Figure 3-6 gives symbolic representations for the


    S1    : (s, 1, 2, f)
    D[S1] : (n = 1)
    C[S1] : Prime = false

    S2    : (s, 1, 3, 4, f)
    D[S2] : (n = 2)
    C[S2] : Prime = true

    S3    : (s, 1, 3, 5, 6, f)
    D[S3] : (n >= 3)
    C[S3] : Prime = forall {i := 2..n-1 | n mod i /= 0}

Figure 3-3. Specification Partition of PRIME


    path (5, 9)
       PC_k      = PC_{k-1} and (Factor_{k-1} ** 2 > N)
       IsPrime_k = IsPrime_{k-1}
       Factor_k  = Factor_{k-1}
       lec_k     = true

    path (5, 6, 7, 9)
       PC_k      = PC_{k-1} and (Factor_{k-1} ** 2 <= N)
                   and ((N mod Factor_{k-1} = 0)
                        or (N mod (Factor_{k-1} + 2) = 0))
       IsPrime_k = false
       Factor_k  = Factor_{k-1}
       lec_k     = true

    path (5, 6, 8)
       PC_k      = PC_{k-1} and (Factor_{k-1} ** 2 <= N)
                   and (N mod Factor_{k-1} /= 0)
                   and (N mod (Factor_{k-1} + 2) /= 0)
       IsPrime_k = IsPrime_{k-1}
       Factor_k  = Factor_{k-1} + 6
       lec_k     = false

Figure 3-6. Symbolic Execution of Loop in PRIME

path condition and variable values after some iteration k,
in terms of the values from the previous iteration. Figure
3-7 then solves these recurrence relations in closed form
for the special case where k = 1 and the general case where
k > 1. Once the loop analysis is complete, the full
implementation partition can be constructed. This is given in
Figure 3-8.
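
The closed form Factor = Factor_0 + 6*k - 6 used in Figure 3-7 can be
checked against a direct execution of the loop. The following Python
sketch is illustrative only; `run_loop` is a hypothetical helper, and k
counts evaluations of the loop body including the one that exits.

```python
def run_loop(n, factor0=5):
    """Execute the PRIME loop; return (k, factor) at loop exit."""
    factor, k = factor0, 0
    while True:
        k += 1
        if factor ** 2 > n:                           # path (5, 9)
            break
        if n % factor == 0 or n % (factor + 2) == 0:  # path (5, 6, 7, 9)
            break
        factor += 6                                   # path (5, 6, 8)
    return k, factor

# The recurrence Factor_k = Factor_{k-1} + 6 on the staying path gives
# Factor = Factor_0 + 6*k - 6 when the loop exits on iteration k.
for n in range(2, 2000):
    k, factor = run_loop(n)
    assert factor == 5 + 6 * k - 6
```

A finite check of this kind only corroborates the closed form; the
analysis itself derives it by solving the recurrence.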

The last part of the first step of Partition Analysis
is to form the procedure partition. The domain of each
subspecification is compared with the domain of each
implementation path, and any region of overlap defines a subdomain
of the procedure partition. For each subdomain, a
computation difference is computed by comparing the computations
specified by the specification and the implementation for
elements in that subdomain. Figure 3-9 gives the procedure
partition for PRIME. Note that subspecification S1 does not
match any path domain (element D10 in Figure 3-9). This
immediately reveals that the implementation fails to
consider the case where N = 1. Note also that subspecification
S3 overlaps no less than five different path domains
(elements D31, D32, D33, D34, and D35).

Step two of Partition Analysis is formal verification
that all computation differences in the procedure partition
are null. Other than D10, this can be done for PRIME. For
D21, for example, the domain is defined by (n = 2) and the


    path (5, 9)
       PC      = PC_0 and (Factor_0 ** 2 > N)
       Factor  = Factor_0
       IsPrime = IsPrime_0

    path (5, 6, 7, 9)
       PC      = PC_0 and (Factor_0 ** 2 <= N)
                 and ((N mod Factor_0 = 0)
                      or (N mod (Factor_0 + 2) = 0))
       Factor  = Factor_0
       IsPrime = false

    path ((5, 6, 8)+, 5, 9)
       PC      = PC_0 and exists {k := 2 ... |
                 ((Factor_0 + 6*k - 6) ** 2 > N)
                 and forall {i := 0 .. k-2 |
                    ((Factor_0 + 6*i) ** 2 <= N)
                    and (N mod (Factor_0 + 6*i) /= 0)
                    and (N mod (Factor_0 + 6*i + 2) /= 0)}}
       Factor  = Factor_0 + 6*k - 6
       IsPrime = IsPrime_0

    path ((5, 6, 8)+, 5, 6, 7, 9)
       PC      = PC_0 and exists {k := 2 ... |
                 ((Factor_0 + 6*k - 6) ** 2 <= N)
                 and ((N mod (Factor_0 + 6*k - 6) = 0)
                      or (N mod (Factor_0 + 6*k - 4) = 0))
                 and forall {i := 0 .. k-2 |
                    (N mod (Factor_0 + 6*i) /= 0)
                    and (N mod (Factor_0 + 6*i + 2) /= 0)}}
       Factor  = Factor_0 + 6*k - 6
       IsPrime = false

Figure 3-7. Loop Expression in PRIME


    P1    : (s, 1, 2, 10, f)
    D[P1] : (n >= 2) and ((n mod 2 = 0) or (n mod 3 = 0))
    C[P1] : Prime = (n < 4)

    P2    : (s, 1, 3, 4, 5, 9, 10, f)
    D[P2] : (n >= 2) and (n < 25) and (n mod 2 /= 0)
            and (n mod 3 /= 0)
    C[P2] : Prime = true

    P3    : (s, 1, 3, 4, 5, 6, 7, 9, 10, f)
    D[P3] : (n >= 25) and (n mod 2 /= 0)
            and (n mod 3 /= 0) and ((n mod 5 = 0)
            or (n mod 7 = 0))
    C[P3] : Prime = false

    P4    : (s, 1, 3, 4, (5, 6, 8)+, 5, 9, 10, f)
    D[P4] : (n >= 25) and (n mod 2 /= 0)
            and (n mod 3 /= 0) and exists {k := 2 ... |
               ((6*k - 1) ** 2 > n) and forall {i := 1..k-1 |
                  ((6*i - 1) ** 2 <= n)
                  and (n mod (6*i - 1) /= 0)
                  and (n mod (6*i + 1) /= 0)}}
    C[P4] : Prime = true

    P5    : (s, 1, 3, 4, (5, 6, 8)+, 5, 6, 7, 9, 10, f)
    D[P5] : (n >= 121) and (n mod 2 /= 0)
            and (n mod 3 /= 0) and exists {k := 2 ... |
               ((6*k - 1) ** 2 <= n)
               and ((n mod (6*k - 1) = 0)
                    or (n mod (6*k + 1) = 0))
               and forall {i := 1 .. k-1 |
                  (n mod (6*i - 1) /= 0)
                  and (n mod (6*i + 1) /= 0)}}
    C[P5] : Prime = false

Figure 3-8. Implementation Partition of PRIME


two computations are the constant value (true) and the
predicate (n < 4), which evaluates to true within the domain.
Proofs for the other domains are more complicated, and are
not presented here.
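
Short of a proof, the claim that the computation differences are null
can at least be corroborated by executing both descriptions of PRIME
over a finite range. The Python sketch below transcribes the
specification partition and the implementation of Figure 3-3; the
names `spec_prime` and `impl_prime` and the range bound are
assumptions, and such a check is evidence only, not verification.

```python
def spec_prime(n):
    """Specification partition: n = 1, n = 2, and n >= 3."""
    if n == 1:
        return False
    if n == 2:
        return True
    return all(n % i != 0 for i in range(2, n))  # forall i in 2..n-1

def impl_prime(n):
    """Transcription of the implementation; N is declared as 2 .. LAST."""
    if n < 2:
        raise ValueError("outside INTEGER range 2 .. INTEGER'LAST")
    if n % 2 == 0 or n % 3 == 0:
        return n < 4
    factor = 5
    while factor ** 2 <= n:
        if n % factor == 0 or n % (factor + 2) == 0:
            return False
        factor += 6
    return True

# Inputs on which the two computations disagree (none are expected).
differences = [n for n in range(2, 3000) if spec_prime(n) != impl_prime(n)]
```

The input n = 1 is accepted by the specification but lies outside the
implementation's declared range, which corresponds to the unpaired
element D10.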

Step 3 of Partition Analysis is test data selection, to
further reinforce confidence in the correspondence between


    D10: (n = 1)
    C10: (false) vs. nothing

    D21: (n = 2)
    C21: (true) vs. (n < 4)

    D31: (n >= 3) and ((n mod 2 = 0) or (n mod 3 = 0))
    C31: (forall {i := 2..n-1 | n mod i /= 0})
         vs. (n < 4)

    D32: (n >= 3) and (n < 25) and (n mod 2 /= 0)
         and (n mod 3 /= 0)
    C32: (forall {i := 2..n-1 | n mod i /= 0})
         vs. (true)

    D33: (n >= 25) and (n mod 2 /= 0)
         and (n mod 3 /= 0) and ((n mod 5 = 0)
         or (n mod 7 = 0))
    C33: (forall {i := 2..n-1 | n mod i /= 0})
         vs. (false)

    D34: (n >= 25) and (n mod 2 /= 0)
         and (n mod 3 /= 0) and exists {k := 2 ... |
            ((6*k - 1) ** 2 > n) and forall {i := 1..k-1 |
               ((6*i - 1) ** 2 <= n)
               and (n mod (6*i - 1) /= 0)
               and (n mod (6*i + 1) /= 0)}}
    C34: (forall {i := 2..n-1 | n mod i /= 0})
         vs. (true)

    D35: (n >= 121) and (n mod 2 /= 0)
         and (n mod 3 /= 0) and exists {k := 2 ... |
            ((6*k - 1) ** 2 <= n)
            and ((n mod (6*k - 1) = 0)
                 or (n mod (6*k + 1) = 0))
            and forall {i := 1 .. k-1 |
               (n mod (6*i - 1) /= 0)
               and (n mod (6*i + 1) /= 0)}}
    C35: (forall {i := 2..n-1 | n mod i /= 0})
         vs. (false)

Figure 3-9. Procedure Partition of PRIME


the specification and implementation (especially in the case
of a failure in the formal verification step), and also to
demonstrate the run-time behavior of the program. Two
criteria for selection are used, one for domain testing and one
for computation testing. Domain testing focuses on the
boundaries between subdomains defined in the procedure
partition and chooses test data for each domain that is both ON
a boundary (and hence in the domain) and also data that is
OFF the boundary and not in the domain. OFF points are
chosen to be as close to the boundary as possible, to minimize
the maximum boundary displacement that would go undetected.
Since PRIME deals with integer values only, OFF points can
be selected that are immediately adjacent to each boundary.
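
For an integer-valued input, one simple reading of ON and OFF points
can be sketched in Python as follows. The definition used here (an ON
point has at least one adjacent integer outside the domain, an OFF
point lies outside with an adjacent integer inside) and the helper
`on_off_points` are assumptions of this sketch, not Richardson's
selection criteria.

```python
def on_off_points(in_domain, candidates):
    """Classify integer candidates relative to a domain predicate."""
    on = [n for n in candidates
          if in_domain(n) and not (in_domain(n - 1) and in_domain(n + 1))]
    off = [n for n in candidates
           if not in_domain(n) and (in_domain(n - 1) or in_domain(n + 1))]
    return on, off

d10 = lambda n: n == 1  # domain of element D10
d21 = lambda n: n == 2  # domain of element D21

print(on_off_points(d10, range(0, 6)))
print(on_off_points(d21, range(0, 6)))
```

For these two single-point domains the sketch selects the point itself
as the ON point and its two integer neighbors as OFF points.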

Computation testing criteria focus on the computations
performed within each domain and help to verify that the
computation difference for each domain is null, even if the
formal verification step failed. Details of the proof (or
attempted proof) often provide guidance for finding good
test points. The specific algebraic properties of the
computations will also dictate which test data will be
selected. Figure 3-10 gives some examples of the test data
that would be selected for the subdomains of PRIME.

Performance  of  Partition  Analysis 

To get some idea for the effectiveness of Partition
Analysis, Richardson used the technique on a set of 34
modules from the programming literature and textbooks,
providing specifications for them as needed (Richardson and
Clarke, 1985: 1486-1488). Most of the modules were correct
or had only a few errors, so mutation analysis was used to

    D10: Domain Testing Criterion:
            N = 1 (on), N = 0, N = 2 (off)
         Computation Testing Criterion:
            N = 1

    D21: Domain Testing Criterion:
            N = 2 (on), N = 1, N = 3 (off)
         Computation Testing Criterion:
            N = 2

    D31: Domain Testing Criterion:
            N = 2 (off), N = 3, N = 4 (on)
            N = 5, N = 7 (off), N = 6, N = 9 (on)
         Computation Testing Criterion:
            N = 3, N = 4, N = 1000

Figure 3-10. Sample Test Data for PRIME


generate large numbers of "mutant" variations of each
module, each with one seeded error. Partition analysis
successfully detected all of the errors that led to incorrect
programs. There were a few mutants that correctly executed
all the test data generated by the Partition Analysis
procedure, and in each case it was shown that the mutant program
was in fact equivalent to the correct one. Richardson
admits that this evaluation is neither as rigorous nor as
complete as it could be, but her results argue favorably for
the effectiveness of the technique.


The next chapter explores extensions to Partition
Analysis that expand its applicability by allowing its use on
programs containing procedure and function calls. The case
of recursive procedures and functions is also considered.

IV.  Extensions  to  Partition  Analysis


This  chapter  describes  how  Partition  Analysis  can  be 
extended  to  apply  to  programs  that  use  procedure  and  func¬ 
tion  calls.  The  first  section  presents  several  approaches 
to  the  general  problem,  while  the  second  section  addresses 
the  special  case  of  recursive  procedures  and  functions.  It 
is  assumed  that  all  procedures  and  functions  have  a  single 
entry  point  and  a  single  exit.  Returns  are  treated  as 
branches  to  a  dummy  node  at  the  end  of  the  routine  to  en¬ 
force  this  convention. 

Procedure  and  Function  Calls 

During  ordinary  symbolic  execution  as  described  in 
Chapter  2,  procedure  and  function  calls  do  not  present  any 
special  problems.  When  a  call  is  encountered,  arguments  are 
bound  to  parameters,  space  for  local  variables  allocated, 
and  control  transferred  to  the  start  of  the  called  routine, 
just  as  in  normal  execution.  At  the  end,  output  values  are 
passed  back  to  the  calling  routine  and  execution  continues. 
The  fact  that  some  or  all  of  the  values  being  passed  around 
are  represented  by  symbolic  expressions  does  not  interfere 
with  this  process. 

In Partition Analysis, however, the need to derive a
static expression for a program causes some problems, and
also presents some opportunities, in the handling of
procedure and function calls. The most direct approach is to
start with the bottom-level routines and derive expressions
for them using the methods described in Chapter 3. Then in
places where the bottom-level routines are called one can
substitute these expressions, making parameter substitutions
and any simplifications possible. This process can be
repeated until the entire program has been analyzed.

Figures  4-1  through  4-4  illustrate  this  approach.  Fig¬ 
ure  4-1  is  a  function  that  returns  the  number  of  days  in  a 
calendar  month.  This  and  later  examples  make  use  of  the 
predefined  Ada  package  CALENDAR  and  the  type  declaration 

    type DATE_TYPE is record
       Day   : DAY_NUMBER;
       Month : MONTH_NUMBER;
    end record;

Function DAYS_IN calls a boolean function LEAP_YEAR
when it is determining the number of days in February.
Figure 4-2 gives an implementation for this function. Analysis
of LEAP_YEAR results in the expression of Figure 4-3. This
expression is then used during the analysis of DAYS_IN to
get the expression of Figure 4-4.
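
The substitution step can be sketched in Python by representing a
partition as a list of (domain, result) pairs and expanding the call
to LEAP_YEAR into one DAYS_IN path per LEAP_YEAR path. This is an
illustrative sketch under that representation; the names `LEAP_YEAR`,
`days_in_partition`, and `days_in` are hypothetical, and domains are
executable predicates rather than symbolic expressions.

```python
# Implementation partition of Leap_Year as (domain, result) pairs.
LEAP_YEAR = [
    (lambda yr: yr % 400 == 0 or (yr % 4 == 0 and yr % 100 != 0), True),
    (lambda yr: yr % 400 != 0 and (yr % 4 != 0 or yr % 100 == 0), False),
]

def days_in_partition():
    """Build (domain, result) pairs for Days_in, expanding Leap_Year."""
    parts = [(lambda m, y: m in (4, 6, 9, 11), 30)]
    for leap_domain, is_leap in LEAP_YEAR:
        # Each February path inherits one domain of the called routine.
        parts.append((lambda m, y, d=leap_domain: m == 2 and d(y),
                      29 if is_leap else 28))
    parts.append((lambda m, y: m in (1, 3, 5, 7, 8, 10, 12), 31))
    return parts

def days_in(month, year):
    """Evaluate Days_in by finding the unique domain containing the input."""
    matches = [r for dom, r in days_in_partition() if dom(month, year)]
    assert len(matches) == 1  # the domains must partition the input space
    return matches[0]
```

The single call site in DAYS_IN becomes two paths, one for each domain
of the callee, which is how the four-element partition arises.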

This approach is not without its disadvantages. First
of all, during top-down development one might want to begin
analysis and testing of routines before all of the lower
ones are complete. Second, the implementation of a
low-level routine may include details that are not relevant to
the routine being tested. For example, numerical routines
like SIN or SQRT are frequently implemented as iterations

       function Days_in (Month : MONTH_NUMBER;
                         Year  : YEAR_NUMBER)
                        return INTEGER is
    s  begin
          case Month is
    1        when 4 | 6 | 9 | 11 =>
    2           return 30;
    3        when 2 =>
    4           if Leap_Year (Year) then
    5              return 29;
                else
    6              return 28;
                end if;
             when others =>
    7           return 31;
          end case;
    f  end Days_in;

Figure 4-1. Implementation of Days_in


until  some  error  bound  is  met.  These  routines  certainly 
need  to  be  tested,  but  in  many  cases  an  abstract  view  of 
these  functions  is  sufficient  and  desirable.  Finally,  in 
the  case  of  a  routine  that  is  called  from  many  places,  or 
for  example  a  routine  that  is  part  of  the  definition  of  an 
abstract  data  type,  it  may  be  desirable  to  demonstrate  the 
correctness  of  the  routine  separately  one  time  (using  Parti¬ 
tion  Analysis  if  it  has  a  formal  specification,  or  some 
other method), and then use some other way of referring to
its  function  when  it  is  called. 

There are at least two ways to represent a procedure or
function without presenting all the details of its
implementation. First, one can use a formal specification that has
been analyzed using Partition Analysis. Specifications are
written at a higher level of abstraction and are usually

       function Leap_Year (Yr : YEAR_NUMBER)
                          return BOOLEAN is
    s  begin
    1     if Yr mod 400 = 0 or (Yr mod 4 = 0
                                and Yr mod 100 /= 0) then
    2        return TRUE;
          else
    3        return FALSE;
          end if;
    f  end Leap_Year;

Figure 4-2. Implementation of Leap_Year


    P1    : (s, 1, 2, f)
    D[P1] : (yr mod 400 = 0) or ((yr mod 4 = 0)
            and (yr mod 100 /= 0))
    C[P1] : Leap_Year = true

    P2    : (s, 1, 3, f)
    D[P2] : (yr mod 400 /= 0) and ((yr mod 4 /= 0)
            or (yr mod 100 = 0))
    C[P2] : Leap_Year = false

Figure 4-3. Implementation Partition of Leap_Year


much simpler than the corresponding implementation. A
specification is ideally also available before a module is
written, permitting the analysis of incomplete programs. In
practice the use of a specification expression is no
different than using the implementation itself.

An  alternative  approach  is  even  more  abstract.  If  a 
routine  is  not  yet  written  or  even  formally  specified,  or  if 
it  defines  a  well-known  function  (such  as  SIN  or  SQRT),  it 
may  be  sufficient  to  represent  it  symbolically  and  give  no 
indication  at  all  during  analysis  of  how  it  works.  This  ap¬ 
proach  is  also  appropriate  for  example  in  the  case  of  an 


    P1    : (s, 1, 2, f)
    D[P1] : (month in {4, 6, 9, 11})
    C[P1] : Days_in = 30

    P2    : (s, 3, 4, 5, f)
    D[P2] : (month = 2) and ((year mod 400 = 0) or
            ((year mod 4 = 0) and (year mod 100 /= 0)))
    C[P2] : Days_in = 29

    P3    : (s, 3, 4, 6, f)
    D[P3] : (month = 2) and (year mod 400 /= 0) and
            ((year mod 4 /= 0) or (year mod 100 = 0))
    C[P3] : Days_in = 28

    P4    : (s, 7, f)
    D[P4] : (month in {1, 3, 5, 7, 8, 10, 12})
    C[P4] : Days_in = 31

Figure 4-4. Implementation Partition of Days_in

abstract data type or when the correctness of a routine has
been separately established.

This approach is easily implemented for function calls.
When a function F is called, its parameters are replaced by
the arguments given but the value returned by F is
represented symbolically. For example, if variable X has the
value a + b, then after the statement

    Y := F (X);

the variable Y would have the value F (a + b).
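
A minimal Python sketch of this idea keeps symbolic values as strings
and leaves the called function uninterpreted. The helper
`symbolic_call` is hypothetical and is shown only to make the
substitution of arguments concrete.

```python
def symbolic_call(name, *args):
    """Represent a call to an unanalyzed function as an uninterpreted term."""
    return f"{name}({', '.join(args)})"

state = {"X": "a + b"}                        # symbolic value of X
state["Y"] = symbolic_call("F", state["X"])   # Y := F (X);
print(state["Y"])
```

Nothing about the body of F is consulted; the term simply records
which function was applied to which symbolic arguments.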

This approach is also illustrated in Figures 4-5 and

    function Days_Between (Date1, Date2 : DATE_TYPE;
                           Year         : YEAR_NUMBER)
                          return INTEGER is
       Difference : INTEGER;
       From, To   : DATE_TYPE;
    begin
       if Date1.Month = Date2.Month then
          Difference := abs (Date1.Day - Date2.Day);
       else
          if Date1.Month < Date2.Month then
             From := Date1;
             To   := Date2;
          else
             From := Date2;
             To   := Date1;
          end if;
          Difference := 0;
          for Mon in From.Month .. To.Month - 1 loop
             Difference := Difference
                           + Days_in (Mon, Year);
          end loop;
          Difference := Difference + To.Day - From.Day;
       end if;
       return Difference;
    end Days_Between;

Figure 4-5. Implementation of Days_Between


4-6, [...] symbolically. Note that if the implementation of
Days_in had been [...] in-line instead, then the loop in
DAYS_BETWEEN would have had three paths that remained in the
loop, with no way of telling which path would be followed during
each iteration, so that the loop would not have been analyzable
by Partition Analysis. Thus the use of this abstraction technique
has allowed a program to be analyzed that otherwise would have
been too complicated, assuming that the correctness of

P1:    (s, 1, 2, 12, f)
D[P1]: (date1.month = date2.month)
C[P1]: Days_Between = abs (date1.day - date2.day)

P2:    (s, 1, 3, 4, 5, 8, 9, (10, 9)*, 11, 12, f)
D[P2]: (date1.month < date2.month)
C[P2]: Days_Between = sum (i := date1.month ..
                                date2.month - 1 |
                           days_in (i, year))
                      + date2.day - date1.day

P3:    (s, 1, 3, 6, 7, 8, 9, (10, 9)*, 11, 12, f)
D[P3]: (date1.month > date2.month)
C[P3]: Days_Between = sum (i := date2.month ..
                                date1.month - 1 |
                           days_in (i, year))
                      + date1.day - date2.day

Figure  4-6.  Implementation  Partition  of  Days_Between 
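
As an informal check of the partition in Figure 4-6, the sketch below
transcribes the logic of Days_Between into Python (not the thesis's
Ada) and compares it with the three domain computations while
treating Days_in as an opaque parameter. The names Date,
partition_value, stub_days_in and real_days_in are illustrative
assumptions, and the use of the standard calendar module as one
possible Days_in is likewise an assumption.

```python
import calendar
from collections import namedtuple
from typing import Callable

Date = namedtuple("Date", ["month", "day"])
DaysIn = Callable[[int, int], int]


def days_between(date1: Date, date2: Date, year: int, days_in: DaysIn) -> int:
    """Transcription of the Days_Between logic of Figure 4-5."""
    if date1.month == date2.month:
        return abs(date1.day - date2.day)
    if date1.month < date2.month:
        start, end = date1, date2
    else:
        start, end = date2, date1
    difference = 0
    for mon in range(start.month, end.month):  # From.Month .. To.Month - 1
        difference += days_in(mon, year)
    return difference + end.day - start.day


def partition_value(date1: Date, date2: Date, year: int, days_in: DaysIn) -> int:
    """Evaluate C[P1], C[P2] or C[P3] according to which domain holds."""
    if date1.month == date2.month:  # D[P1]
        return abs(date1.day - date2.day)
    if date1.month < date2.month:  # D[P2]
        total = sum(days_in(i, year) for i in range(date1.month, date2.month))
        return total + date2.day - date1.day
    # D[P3]
    total = sum(days_in(i, year) for i in range(date2.month, date1.month))
    return total + date1.day - date2.day


def stub_days_in(month: int, year: int) -> int:
    """Arbitrary stand-in: the comparison must hold for any Days_in."""
    return 100 * month + year % 7


def real_days_in(month: int, year: int) -> int:
    """One concrete Days_in, taken from the standard calendar module."""
    return calendar.monthrange(year, month)[1]


dates = [Date(m, d) for m in (1, 2, 6, 12) for d in (1, 15, 28)]
for d1 in dates:
    for d2 in dates:
        for days_in in (stub_days_in, real_days_in):
            assert (days_between(d1, d2, 1988, days_in)
                    == partition_value(d1, d2, 1988, days_in))

print(days_between(Date(1, 15), Date(3, 10), 1987, real_days_in))  # 54
```

Because the comparison also passes with the arbitrary stub, it
exercises only the structure of Days_Between; the correctness of
Days_in itself still has to be established separately.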

Procedure calls can also be treated symbolically, but
the notation is necessarily different. The proposed notation
will allow a procedure to be represented functionally
so that analysis can proceed.

A procedure can be viewed as a function that has an input
vector (X_1, ..., X_m) and produces an output vector
(Y_1, ..., Y_n). Parameters of mode "in" are part of the input
vector, and parameters of mode "out" are part of the output
vector. Parameter mode "in out" is in effect shorthand for
an element that holds an input value and will also receive
an output value. Such parameters are "split" and listed in
both vectors. The output vector can be viewed as an unnamed
record type that the procedure "returns".

To represent a procedure call symbolically, the input
arguments are bound to the procedure's input parameters,
and the values of the output parameters are represented by the
procedure name and an output parameter name, using the usual
syntax for representing components of a record. For example,
the predefined Ada package CALENDAR includes a procedure
with the specification

procedure SPLIT (Date    : in  TIME;
                 Year    : out YEAR_NUMBER;
                 Month   : out MONTH_NUMBER;
                 Day     : out DAY_NUMBER;
                 Seconds : out DAY_DURATION);

A call to this procedure might look like

Split (Todays_Date, Current_Year, Current_Month,
       Current_Day, Current_Time);

The effect of this call in the proposed notation would be

Current_Year  = Split.Year    (v (Todays_Date))
Current_Month = Split.Month   (v (Todays_Date))
Current_Day   = Split.Day     (v (Todays_Date))
Current_Time  = Split.Seconds (v (Todays_Date))

where v (Todays_Date) stands for the symbolic value of
Todays_Date at the time of the call. Analysis can now proceed
using these values.
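
The binding step can be mechanized along the following lines. This
is only a sketch in Python (not Ada): symbolic_procedure_call is a
hypothetical helper, and the formal parameter names are passed in as
plain strings rather than being read from a real package
specification.

```python
from typing import Dict, List


def symbolic_procedure_call(
    proc: str,
    in_args: List[str],
    out_bindings: Dict[str, str],
) -> Dict[str, str]:
    """Return the symbolic value of each actual output argument.

    proc         -- procedure name, e.g. "Split"
    in_args      -- names of the variables passed as "in" arguments
    out_bindings -- maps each formal "out" parameter to the actual
                    variable that receives it
    """
    # Every output depends on the symbolic values of all inputs.
    inputs = ", ".join(f"v ({name})" for name in in_args)
    return {
        actual: f"{proc}.{formal} ({inputs})"
        for formal, actual in out_bindings.items()
    }


effect = symbolic_procedure_call(
    "Split",
    ["Todays_Date"],
    {
        "Year": "Current_Year",
        "Month": "Current_Month",
        "Day": "Current_Day",
        "Seconds": "Current_Time",
    },
)
for variable, value in effect.items():
    print(f"{variable} = {value}")
```

An "in out" parameter would appear twice under this scheme: once in
in_args with its value before the call, and once in out_bindings as
the variable that receives the new value.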

Recursive  Procedures  and  Functions 

If a procedure or function is recursive, it cannot be
analyzed directly by ordinary Partition Analysis. A simple
technique, however, permits analysis in many cases.

The approach to take is analogous to the loop analysis
technique presented in Chapter 3. All paths through the
routine are identified and symbolically executed. Recursive
calls are represented symbolically as described above. The
result is a set of recurrence relations consisting of a number
of base cases (paths without recursive calls) and a number
of recursive cases (paths with such calls). As in the
case  of  loop  analysis,  these  relations  are  then  converted  to 
a  closed-form  expression  to  be  used  if  desired  wherever  the 
routine  is  called,  or  in  the  further  application  of  Parti¬ 
tion  Analysis  (i.e.,  the  formal  verification  and  testing 
procedure)  to  the  routine  itself. 

A  short  example  of  this  kind  of  analysis  follows.  Fig¬ 
ure  4-7  is  a  recursive  function  for  computing  factorials. 
Figure  4-8  gives  the  recurrence  relations  derived  from  the 
symbolic  execution  of  this  routine,  including  the  notation 
for  representing  recursive  calls.  For  this  example,  it  is 
then trivial to show that these relations correspond to the
standard definition of factorial:

Factorial (N) = product (i := 1 .. N | i)
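
The correspondence can also be spot-checked numerically. The sketch
below, in Python rather than Ada, evaluates the two recurrence
relations (a base case for N = 0 or N = 1 and a recursive case
otherwise) and compares them with the product form over a small
range. The function names are illustrative, and a finite check of
this kind supplements the proof rather than replacing it.

```python
from math import prod


def factorial_recurrence(n: int) -> int:
    """Evaluate the recurrence relations derived from the two paths."""
    if n == 0 or n == 1:  # base case: path without a recursive call
        return 1
    return n * factorial_recurrence(n - 1)  # recursive case


def factorial_closed_form(n: int) -> int:
    """product (i := 1 .. N | i); the empty product for N = 0 is 1."""
    return prod(range(1, n + 1))


for n in range(0, 11):
    assert factorial_recurrence(n) == factorial_closed_form(n)

print(factorial_recurrence(5))  # 120
```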

Restrictions  analogous  to  those  on  loops  apply  to  the 
analysis  of  recursive  procedures  and  functions.  If  a  recur¬ 
sive  routine  has  several  paths  that  contain  recursive  calls 
such  that  the  sequence  of  paths  followed  is  data  dependent, 
then  the  recurrence  relations  derived  for  the  routine  will 
not be solvable (they may not be anyway). For example, a
routine that searches a binary search tree by calling itself
recursively on either the left or right subtree until either
the value being sought or a leaf node is found cannot be

     function Factorial (N : INTEGER) return INTEGER is
s    begin
1       if (N = 0) or (N = 1) then
2          return 1;
        else
3          return N * Factorial (N - 1);
        end if;
f    end Factorial;


Figure  4-7.  Implementation  of  Factorial 


P1: (s, 1, 2, f)
    PC = PC and ((n = 0) or (n = 1))
    Factorial = 1

P2: (s, 1, 3, f)
    PC = PC and (n /= 0) and (n /= 1)
    Factorial = n * Factorial (n - 1)


Figure  4-8.  Recurrence  Relations  for  Factorial 


fully analyzed because the sequence of "left" and "right"
moves  cannot  be  determined  at  analysis  time. 

Of  course,  this  only  applies  if  a  closed-form  solution 
is  truly  necessary:  the  need  for  such  a  solution  is  not  al¬ 
ways  present.  Frequently  a  routine  implemented  recursively 
has  a  specification  that  is  also  recursive.  Thus  the  recur¬ 
rence  relations  alone  may  be  sufficient  to  prove  compliance 
with  the  specification  during  the  formal  verification  phase. 
In  the  example  above,  the  recurrence  relations  derived  from 
the  implementation  clearly  correspond  to  the  usual  recursive 
definition  of  the  factorial  function. 

Whether a closed-form expression is used or not, the
selection of data during the testing phase to test each
subdomain, that is, each path through the routine, guarantees
that  all  base  cases  and  all  recursive  cases  will  be  exer¬ 
cised.  Hence  the  traditional  guidelines  for  testing  recur¬ 
sive  routines  are  subsumed  by  the  Partition  Analysis  method. 

Application  of  Partition  Analysis  to  Whole  Programs 

The  purpose  of  extending  Partition  Analysis  to  include 
procedure  and  function  calls  was  to  be  able  to  analyze  en¬ 
tire  programs.  While  the  unavailability  of  I/O,  limitations 
on  loops,  and  complexity  of  the  method  when  carried  out  man¬ 
ually  preclude  any  meaningful  example  from  being  given,  this 
section  outlines  one  procedure  that  could  be  applied  once 
these  other  problems  are  solved. 

Partition  Analysis  seems  to  lend  itself  best  to  a  bot¬ 
tom-up  testing  approach.  During  unit  testing  individual 
modules  can  be  analyzed  and  compared  with  their  specifica¬ 
tions,  and  also  tested  using  suitable  driver  routines.  This 
is not very different from current practice; the point is
that  the  application  of  Partition  Analysis  will  make  it  more 
systematic  and  thorough.  If  a  "unit"  in  fact  contains  pro¬ 
cedures  and  functions  of  its  own,  then  they  will  need  to  be 
analyzed  first.  Further,  routines  that  are  called  by  many 
units can be symbolically executed once and the resulting
expressions  placed  in  a  library  for  use  in  later  analyses. 

As units are combined into larger entities, it will become
desirable, perhaps crucial, to switch to one of the
more abstract (and compact) representations for the various
low-level routines, either through the use of specifications
or of symbolic representations. The choice of representation
is not arbitrary, however. During the formal verification
phase in particular, if a symbolic representation is
used, there must be enough semantic information available to
manipulate formulas containing that representation. For example,
during symbolic execution in general it is always assumed
that enough is known about operators like "+" and
"mod" to be able to simplify and compare formulas containing
these symbols; the same must be true for user-defined functions
and procedures.

How much information is enough will depend on the application.
For example, a function INTEGRATE that computes
definite integrals given an arbitrary real-valued function
and an arbitrary real interval does not need to know anything
about the function beyond the basics that it is computable,
defined on the interval, and returns a real value.
In such a case, use of a symbolic representation such as
SIN(X) is appropriate. In other contexts it may be
necessary to know more about the function, such as that
SIN(-X) = -SIN(X), or that SIN(2*X) = 2 * SIN(X) * COS(X).
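
To make the idea of introducing semantic information concrete, the
following sketch applies the single fact SIN(-X) = -SIN(X) as a
rewrite rule over a small expression tree. It is written in Python,
not Ada; the tuple encoding of expressions, the "neg" operator tag
and the name apply_odd_symmetry are all assumptions made for this
illustration.

```python
from typing import Tuple, Union

# An expression is a variable name or a tuple (operator, operand, ...).
Expr = Union[str, Tuple]


def apply_odd_symmetry(expr: Expr, func: str = "SIN") -> Expr:
    """Rewrite func(-x) as -func(x) everywhere in expr, bottom-up."""
    if isinstance(expr, str):
        return expr
    op, *operands = expr
    operands = [apply_odd_symmetry(e, func) for e in operands]
    if (op == func and operands
            and isinstance(operands[0], tuple)
            and operands[0][0] == "neg"):
        # func(neg(x))  ->  neg(func(x))
        return ("neg", (func, operands[0][1]))
    return (op, *operands)


# Without the rule, SIN(-X) + SIN(X) is just an opaque sum.
lhs = ("+", ("SIN", ("neg", "X")), ("SIN", "X"))
print(apply_odd_symmetry(lhs))
# ('+', ('neg', ('SIN', 'X')), ('SIN', 'X'))
```

Once the rule has been applied, an ordinary arithmetic simplifier
could cancel the two terms; without it, SIN remains an uninterpreted
symbol and no such simplification is available.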

One  source  of  information  about  a  routine  at  a  reason¬ 
ably  abstract  level  is  the  specification  partition,  as  long 
as  the  implementation  is  at  some  time  shown  consistent  with 
it.  If  even  more  information  is  needed,  the  full  procedure 
partition itself can be used. Semantic information can be
"held in reserve" until needed, by using a symbolic representation
and introducing semantic information only where
needed  in  a  proof,  or  it  can  be  included  directly  by  substi¬ 
tuting  the  appropriate  expressions  into  the  formulas  of  the 
high-level  routine  being  analyzed. 

During the test data selection step it is also advisable
to use some of the information gathered during the analysis
of the low-level routines. Specifically, in order to test
fully all interfaces and all paths through the program (up
to loop iterations and recursive calls), domain information
from the procedure partitions of the low-level routines
should be used. An example is the best way to illustrate
this. A program that made use of factorials might have several
domains whose computation includes a call to the FACTORIAL
function above. If these calls are represented symbolically,
then during test data selection each of these domains
should be further subdivided into one where the argument
to FACTORIAL is zero or one and another where it is two or
more. This inclusion of low-level domain information at
higher levels of the program helps to maintain confidence
that the sum of the parts is correct, and not just the parts
themselves. It also results in a test suite that exercises
the entire program, yet was developed in the process of
testing the program incrementally.
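
The subdivision of a high-level domain can be sketched as follows.
The code is Python rather than Ada, and everything in it is
hypothetical: the helper names refine_domain and select_tests, the
high-level domain x >= 1, and the assumption that its computation
calls FACTORIAL (x - 1). The base domain of FACTORIAL is taken as
n = 0 or n = 1, following the path condition in Figure 4-8.

```python
from typing import Callable, Dict, Iterable

Predicate = Callable[[int], bool]


def refine_domain(
    domain: Predicate,
    argument: Callable[[int], int],
    low_level_domains: Dict[str, Predicate],
) -> Dict[str, Predicate]:
    """Split `domain` by the low-level domain the call argument falls in."""
    def make(low: Predicate) -> Predicate:
        return lambda x: domain(x) and low(argument(x))
    return {name: make(low) for name, low in low_level_domains.items()}


def select_tests(
    subdomains: Dict[str, Predicate],
    candidates: Iterable[int],
) -> Dict[str, int]:
    """Pick the first candidate input that lies in each subdomain."""
    pool = list(candidates)
    chosen: Dict[str, int] = {}
    for name, pred in subdomains.items():
        for x in pool:
            if pred(x):
                chosen[name] = x
                break
    return chosen


# Domains from the procedure partition of FACTORIAL.
factorial_domains: Dict[str, Predicate] = {
    "base": lambda n: n in (0, 1),
    "recursive": lambda n: n >= 2,
}

# Hypothetical high-level domain x >= 1 that calls FACTORIAL (x - 1).
refined = refine_domain(lambda x: x >= 1, lambda x: x - 1, factorial_domains)
tests = select_tests(refined, range(0, 10))
print(tests)  # {'base': 1, 'recursive': 3}
```

Each refined subdomain contributes at least one test input, so both
the base and the recursive path of the low-level routine are
exercised through the high-level interface.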

V. Conclusions and Recommendations


This chapter offers conclusions concerning Partition
Analysis in terms of its effectiveness and scope of
applicability. It also presents recommendations for future work.


Conclusions 

The  basic  problem  of  handling  procedure  and  function 
calls  was  solved.  A  problem  many  verification  procedures 
have,  that  of  getting  extremely  unwieldy  even  for  programs 
of  modest  size,  was  also  addressed.  This  problem  was  not 
solved  --  it  seems  highly  unlikely  that  any  effective  method 
of  verification  will  be  quick  and  easy  to  apply  --  but  tech¬ 
niques  were  suggested  for  controlling  some  of  this  explosive 
increase in complexity. The straightforward approach of direct
inclusion of subroutines (in effect, in-line expansion)
quickly gets very large, as expected, but abstract representations
can greatly simplify the analysis, in particular the
derivation of the procedure partition. To the extent that
information from the specification and/or implementation is
reintroduced, the formal verification step will approach the
complexity of having used the direct representation in the
first place, but it will never be worse than that, and it may
remain considerably simpler. In the testing phase only domain
information is reintroduced, so a simplification has occurred

Looking at the method as a whole, Partition Analysis is
largely  language-independent.  Although  all  of  the  examples 
presented  here  were  in  Ada,  in  the  original  work  Richardson 
presented  examples  worked  out  in  Ada,  Pascal,  and  FORTRAN. 

In  the  course  of  the  work  presented  here  on  analyzing  recur¬ 
sive  routines,  examples  written  in  Pascal  and  Ada  were  suc¬ 
cessfully  analyzed.  Some  examples  in  LISP  were  also  tried, 
with  mixed  success.  The  main  difficulty  with  LISP  is  the 
application  of  semantic  information  regarding  built-in  oper¬ 
ators  such  as  CAR,  CDR,  CONS,  and  so  forth,  in  order  to  ma¬ 
nipulate  formulas  containing  them,  and  also  the  recursive 
nature of S-expressions, which seems to demand an induction
proof  in  most  cases.  Such  difficulties  make  analysis  harder 
but not impossible. The above languages and similar ones
represent by far the bulk of the software being written today.

Partition  Analysis  seems  best  suited  for  general  scien¬ 
tific,  engineering,  and  mathematical  applications.  Some 
specialized  applications  such  as  compilers,  operating  sys¬ 
tems,  database  and  graphics  have  special  techniques  of  their 
own  that  are  used  for  software  development.  Partition  Anal¬ 
ysis  as  a  general  tool  is  not  well  suited  to  such  areas. 
Partition Analysis also seems to be inappropriate for verifying
complex human-computer interfaces, this being a poorly
understood process that is still very much an art. For embedded
computer systems, Partition Analysis can usefully analyze
many of the algorithms used, and the testing phase


will  demonstrate  the  full  range  of  runtime  behavior.  How¬ 
ever,  much  embedded  software  uses  control  structures  such  as 
interrupts,  coroutines  or  parallel  execution  that  are  beyond 
the  scope  of  the  method. 

Partition  Analysis  works  best  in  a  traditional  software 
development  life  cycle  of  requirements  definition,  specifi¬ 
cation  and  high  level  design,  detailed  design  and  coding, 
and  unit  and  integration  testing.  It  is  also  consistent 
with  the  use  of  an  object-oriented  design  methodology.  Im¬ 
plementations  of  abstract  objects  can  be  proven  consistent 
with  their  specifications  and  then  treated  as  primitive  ob¬ 
jects  during  subsequent  analysis  as  explained  in  Chapter  4. 
Partition  Analysis  is  less  well  suited  to  a  rapid  prototyp¬ 
ing  environment  where  the  requirements  are  ill-defined  and 
rapidly changing, since formal specifications play such a
central  role  in  the  method. 

As  indicated  briefly  in  Chapter  3,  early  empirical 
studies  indicate  that  Partition  Analysis  is  extremely  effec¬ 
tive  at  detecting  program  faults.  What  makes  Partition 
Analysis  testing  superior  to  other  techniques?  Black  box 
testing  looks  at  the  input  domain  and  finds  tests  that  thor¬ 
oughly  exercise  a  correct  program,  but  possibly  not  an  in¬ 
correct  one.  It  also  ignores  some  distinctions  within  input 
subdomains  that  are  unique  to  the  implementation,  and  thus 
may  fail  to  test  some  relevant  cases.  Black  box  testing  can 
do  a  reasonable  job  of  finding  path  selection  errors,  but  in 
general cannot find missing path errors or computation errors
effectively.

Glass  box  testing,  on  the  other  hand,  looks  primarily 
at  the  implementation,  devising  tests  that  thoroughly  exer¬ 
cise  whatever  the  program  does,  but  not  necessarily  what  it 
is supposed to do, using the specification only as an
"oracle"  to  distinguish  right  from  wrong  answers.  This 
characterization  applies  to  Functional  Testing  as  well. 

Glass  box  testing  is  generally  good  at  finding  computation 
errors  and  path  selection  errors,  but  again  cannot  detect 
missing  path  errors  in  general  because  the  specification  is 
largely  ignored. 

Partition Analysis is more successful at finding errors
because it gives equal weight and equal effort to the analyses
of the specification and the implementation. This analysis
is followed by an attempt at a formal proof, where difficulties
or counterexamples will point out faults, and also
by  extensive  testing  of  all  relevant  domains  in  the  program, 
using  both  black  and  glass  box  techniques. 


Future  Directions 


The  original  Partition  Analysis  method  was  very  re¬ 
stricted  in  the  language  constructs  it  could  handle;  it 
needs  to  be  able  to  handle  the  full  range  of  constructs  en¬ 
countered  in  real  programs.  The  work  in  Chapter  4  on  proce¬ 
dures  and  functions  is  a  step  in  this  direction.  The  method 
desperately needs to be tried on larger examples, to judge
better the effectiveness of the method as a whole and to
validate the procedure described in Chapter 4 for analyzing
whole programs. If done entirely by hand, however, this is
impractical.  Some  automated  tools  exist,  for  example  to  do 
symbolic  execution,  automatic  theorem  proving,  and  test  case 
execution  and  monitoring,  but  these  were  not  available  for 
this  thesis.  The  unavailability  of  I/O  and  especially  the 
limited  power  of  existing  loop  analysis  techniques  further 
complicate  matters.  Any  future  work  directed  in  any  of 
these  areas  (use  of  existing  automated  tools,  inclusion  of 
I/O  or  new  loop  analysis  techniques)  would  have  considerable 
value,  especially  if  it  included  an  analysis  of  larger  pro¬ 
grams  than  has  heretofore  been  done.  Automation  of  the  Par¬ 
tition  Analysis  procedure  itself  would  be  premature,  given 
the  heuristic  nature  of  several  parts,  in  particular  the 
verification  and  test  data  selection  steps. 